Maximum Entropy Semi-Supervised Inverse Reinforcement Learning
نویسندگان
چکیده
A popular approach to apprenticeship learning (AL) is to formulate it as an inverse reinforcement learning (IRL) problem. The MaxEnt-IRL algorithm successfully integrates the maximum entropy principle into IRL and unlike its predecessors, it resolves the ambiguity arising from the fact that a possibly large number of policies could match the expert’s behavior. In this paper, we study an AL setting in which in addition to the expert’s trajectories, a number of unsupervised trajectories is available. We introduce MESSI, a novel algorithm that combines MaxEnt-IRL with principles coming from semi-supervised learning. In particular, MESSI integrates the unsupervised data into the MaxEnt-IRL framework using a pairwise penalty on trajectories. Empirical results in a highway driving and grid-world problems indicate that MESSI is able to take advantage of the unsupervised trajectories and improve the performance of MaxEnt-IRL.
منابع مشابه
Graph Based Semi-Supervised Approach For Information Extraction
Classification techniques deploy supervised labeled instances to train classifiers for various classification problems. However labeled instances are limited, expensive, and time consuming to obtain, due to the need of experienced human annotators. Meanwhile large amount of unlabeled data is usually easy to obtain. Semi-supervised learning addresses the problem of utilizing unlabeled data along...
متن کاملSemi-Supervised Learning via Generalized Maximum Entropy
Various supervised inference methods can be analyzed as convex duals of the generalized maximum entropy (MaxEnt) framework. Generalized MaxEnt aims to find a distribution that maximizes an entropy function while respecting prior information represented as potential functions in miscellaneous forms of constraints and/or penalties. We extend this framework to semi-supervised learning by incorpora...
متن کاملA Maximum Entropy Approach to Semi-supervised Learning
Various supervised inference methods can be analyzed as convex duals of a generalized maximum entropy framework, where the goal is to find a distribution with maximum entropy subject to the moment matching constraints on the data. We extend this framework to semi-supervised learning using two approaches: 1) by incorporating unlabeled data into the data constraints and 2) by imposing similarity ...
متن کاملGeneralized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data
In this paper, we present an overview of generalized expectation criteria (GE), a simple, robust, scalable method for semi-supervised training using weakly-labeled data. GE fits model parameters by favoring models that match certain expectation constraints, such as marginal label distributions, on the unlabeled data. This paper shows how to apply generalized expectation criteria to two classes ...
متن کاملSemi-supervised learning for text classification using feature affinity regularization
Most conventional semi-supervised learning methods attempt to directly include unlabeled data into training objectives. This paper presents an alternative approach that learns feature affinity information from unlabeled data, which is incorporated into the training objective as regularization of a maximum entropy model. The regularization favors models for which correlated features have similar...
متن کامل